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Abstract Quasi-Monte Carlo (QMC) methods are applied to multi-level Finite Ele- 
ment (FE) discretizations of elliptic partial differential equations (PDEs) with a ran- 
dom coefficient. The representation of the random coefficient is assumed to require a 
countably infinite number of terms. 

The multi-level FE discretizations are combined with families of QMC meth- 
ods (specifically, randomly shifted lattice rules) to estimate expected values of linear 
functionals of the solution, as in 117111811231 in the single-level setting. Here, the ex- 
pected value is considered as an infinite-dimensional integral in the parameter space 
corresponding to the randomness induced by the random coefficient. In this paper 
we study the same model as in l23l . The error analysis of ll23ll is generalized to a 
multi-level scheme, with the number of QMC points depending on the discretization 
level, and with a level-dependent dimension truncation strategy. In some scenarios, it 
is shown that the overall error of the expected value of the functionals of the solution 
(i.e., the root-mean-square error averaged over all shifts) is of order ff(h 2 ), where h 
is the finest FE mesh width, or ff(N~ l+s ) for arbitrary 8 > 0, where N denotes the 
Q\ . maximal number of QMC sampling points in the parameter space. For these scenar- 

ios, the total work for all PDE solves in the multi-level QMC-FE method is shown 
to be essentially of the order of one single PDE solve at the finest FE discretization 
level, for spatial dimension d > 2 with linear elements. 

The analysis exploits regularity of the parametric solution with respect to both 
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" the physical variables (the variables in the physical domain) and the parametric vari- 



Frances Y. Kuo 

School of Mathematics and Statistics, University of New South Wales, Sydney NSW 2052, Australia 
E-mail: f.kuo@unsw.edu.au 



5_l Christoph Schwab 

Seminar for Applied Mathematics, ETH Zurich, ETH Zentrum, HG G57.1, CH8092 Zurich, Switzerland 
E-mail: christoph. schwab@sam. math. ethz.ch 

Ian H. Sloan 

School of Mathematics and Statistics, University of New South Wales, Sydney NSW 2052, Australia 
E-mail: i.sloan@unsw.edu.au 



2 



Frances Y. Kuo et al. 



ables (the parameters corresponding to randomness). As in l23l . families of QMC 
rules with "POD weights" ("product and order dependent weights") which quantify 
the relative importance of subsets of the variables are found to be natural for proving 
convergence rates of QMC errors that are independent of the number of paramet- 
ric variables. Our POD weights for the multi-level QMC-FE algorithm are different 
from those for the single-level algorithm in (23). Conditions on the data of the prob- 
lem to achieve a certain rate of convergence coincide with the sufficient conditions 
obtained in [6] to achieve the same convergence rates for best A^-term approximations 
of solutions for the parametric PDE model. 

Keywords Multi-level • Quasi-Monte Carlo methods • Infinite dimensional 
integration • Elliptic partial differential equations with random coefficients • 
Karhunen-Loeve expansion • Finite element methods 

Mathematics Subject Classification (2000) 65D30 65D32 65N30 
1 Introduction 

This paper is a sequel to our work ll23l . where we analyzed theoretically the ap- 
plication of quasi-Monte Carlo (QMC) methods combined with finite element (FE) 
methods for a scalar, second order elliptic partial differential equation (PDE) with 
random diffusion. The diffusion is assumed to be given as an infinite series with 
random coefficients. As in l23l . we consider the model parametric elliptic Dirichlet 
problem 

-V ■ (a(x,y)Vu(x,y)) = f(x) in Del 1 ', u(x,y) = on dD , (1) 

for D C W 1 a bounded domain with a Lipschitz boundary dD. In the gradients are 
understood to be with respect to the physical variable x which belongs to D, and the 
parameter vector y = (yj)j>\ consists of a countable number of parameters yj which 
we assume, as in 11231 . to be i.i.d. uniformly distributed, with 

,e(4,i)« =:U. 

The parameter)? is thus distributed on U with the uniform probability measure fi (dy) — 
<8>;>i dyj = dy. The parametric diffusion coefficient a(x,y) in (Q3 is assumed to de- 
pend linearly on the parameters ys as follows: 

a(x,y) =a(x) + Y t yjWj(x) , xeD, yeU. (2) 

The y/j can either be Karhunen-Loeve eigenfunctions (see, e.g. ||3T1 ). or other suitable 
function systems in L 2 (D). As in l23l we impose a number of assumptions on a and 
y/j as well as on the domain D: 

(Al) We have a G L°°(D) and ||V0'Hz.-(B) < °°- 

(A2) There exist a max and a m \ n such that < a m i n < a(x,y) < a max for all x G D and 
ye U. 

(A3) There exists p G (0, 1) such that £j>i II Vj\\ P L °°t D ) < °°- 
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(A4) With the norm ||v|| w i.^ (D) := max{||v||i~( ), ||Vv|| £ -( fl )}, we have a £ W l -°°(D) 

and £_,->, ||y>llwi--(z>) < °°- 
(A5) The sequence yfj lS ordered so that || yi \\l°°(d) =_ II W2\\l°°(d) > ■ • • ■ 
(A6) The domain D is a convex and bounded polyhedron with plane faces. 

We now briefly comment on each assumption. Assumption (Al) ensures that the 
coefficient a(x,y) is well-defined for all parameters y £ U. Assumption (A2) yields 
the strong ellipticity needed for the standard FE analysis. Assumption (A3) is stronger 
than Assumption (Al). This assumption implies decay of the fluctuation coefficients 
y/j, with faster decay for smaller p. The value of p determined the convergence rate in 
the previous paper J23). Assumption (A4) guarantees that the FE solutions converge. 
Assumption (A5) allows the truncation of the infinite sum in (0 to, say, s terms. This 
assumption is not needed in this paper when the functions y/j satisfy an orthogonality 
property in relation to the FE spaces, see £|3.3l below. Finally, Assumption (A6) only 
simplifies the FE analysis and can be substantially relaxed. 

Our aim in this paper is to extend the QMC-FE algorithm of 1231 for the efficient 
computation of expected values of continuous linear functionals of the solution of ([1) 
to a multi-level setting so that the overall computational cost is substantially reduced. 
Suppose the linear functional is G(-) : Hq (D) h> M. We are interested in approximat- 
ing the integral 



The (single level) strategy in l23l was to (i) truncate the infinite sum in the expansion 
of the coefficient to s terms, (ii) approximate the solution of the truncated PDE prob- 
lem using a FE method with mesh width h, and (iii) approximate the integral using a 
QMC method (an equal-weight quadrature rule) with N points in s dimensions. The 
QMC-FE algorithm can therefore be expressed as 



where ui denotes the FE solution of the truncated PDE problem, andy' 1 ), . . . are 
QMC sample points which are judiciously chosen from the s-dimensional unit cube 
[— h,j] s - More precisely, the QMC rules considered in l23l are randomly shifted 
lattice rules; more details will be given in the next section. It was established in ll23l 
that the root-mean-square of the error/(G(w)) — Q s .N(G(u s h )) over all random shifts is 
a sum of three parts: a truncation error, a QMC error, and a FE error. For example, in 
the particular case where Assumption (A3) holds with p = 2/3 and/,G(-) £ L 2 (D), it 
was shown that the three additive parts of the error are of orders ff(s- { ), 0(N- l+s ), 
and &(h 2 ) = &(M h ), respectively, where M/, is the number of FE nodes and d is 
the spatial dimension. Assuming the availability of a linear complexity FE solver in 
the domain D (e.g., a multigrid method), the overall cost of the (single level) QMC-FE 
algorithm is 0{sNMh). There, as in the present paper, we assume that the functions 




(3) 
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and their (piecewise-constant) gradients are explicitly known, and that integration 
of any FE basis functions over a single element in the FE mesh is available at unit 
cost. 

The purpose of the present paper is the design and the error-versus-cost analysis 
of a multi-level extension of the single level algorithm developed in 1231 . The multi- 
level algorithm takes the form 

fiS(G(«)) := t {°K ~ u t\)) > ( 4 ) 

where S£ is a nondecreasing sequence of truncation dimensions, u s £ denotes the FE 
approximation with mesh width h( of the PDE problem with parametric input (fJI 
truncated at S{ terms, with the convention a^ -1 = 0, and Q S( ,N t denotes the (randomly 
shifted) QMC quadrature rule with N( points in s# dimensions. (For the practical 
form of the quadrature rule, including randomization, see d20b below.) Assuming 
again the availability of a linear complexity FE solver in the domain D, the overall 
cost of this multi-level QMC-FE algorithm is therefore 6 '(£f =0 s iNtMh,) operations. 
Again we use randomly shifted lattice rules, and we show that sg, Ni, and M%, enter 
the root-mean-square of the error I(G(u)) — Q^(G(u)) over all random shifts in a 
combined additive and multiplicative manner. Upon choosing S( and N( in relation 
to h(, appropriately at each level I, we arrive at a dramatically reduced overall cost 
compared to the single level algorithm. 

The general concept of multi-level algorithms was first introduced by Heinrich 
Ifl9l and reinvented by Giles 11411151 . Since then the concept has been applied in 
many areas including high dimensional integration, stochastic differential equations, 
and several types of PDEs with random coefficients. Most of these works used multi- 
level Monte Carlo (MC) algorithms, while few papers considered multi-level QMC 
algorithms. The multi-level QMC-FE algorithm (0]) proposed and analyzed here dif- 
fers in several core aspects from the abstract multi-level QMC framework proposed 
in B16II26I . It also differs from the multi-level MC approach which has recently been 
developed for elliptic problems with random input data of the general form dTJ in 12] 
[3ir51l30l[35l . The model considered here, as in ll23l . is infinite-dimensional. Previous 
treatments of infinite-dimensional quadrature include 11 611 24 11261 with QMC meth- 
ods, l20l with MC methods, and ll29l with Smolyak (or sparse-grid) quadrature. 

There is an important special case where the functions yfj satisfy an orthogonality 
property in relation to the FE spaces, see d28l l ahead. In this case there is no dimen- 
sion truncation error at any level, that is, with S( chosen in an appropriate way we 
have u£ = u\ lf . Furthermore, due to the special structure of the expansion of the co- 
efficient a(x,y), the overall cost is only <?(J^T =0 A^A/>,, log(M/,,)) operations. To have 
this orthogonality property we need multiresolution function systems; examples are 
given in ^3 . 3 1 We emphasize that the eigenfunction system of the covariance operator 
does not have this property. 

One of the main findings of the present paper is that the error analysis of the 
multi-level QMC-FE algorithm requires smoothness of the parametric solution si- 
multaneously with respect to the spatial variable x and to the parametric variable y. 
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Another key point is that we require decay of stronger norms of the fluctuation coef- 
ficients y/j compared to Assumptions (A3) and (A4): 

(A7) For p as in (A3), there exists q e [p, 1] such that L;>i || Vy'llvpWo) < °°' 

For the multi-level QME-FE algorithm the convergence rate will be determined by 
the value of q in (A7) rather than by the value of p as for the single level algorithm 
in El. 

As in most modern analyses of QMC integration in high dimensions, we use pa- 
rameters Yu< known as weights, to describe the relative importance of the subset of the 
variables with labels in the finite subset u C N. (These weights are to be distinguished 
from quadrature weights in, e.g., Gaussian quadrature formulas.) In l23l the weights 
were chosen to minimize a certain upper bound on the product of the worst case error 
and the norm in the function space, yielding a special form of weights called "POD 
weights", which stand for "product and order dependent weights": 

Yu = r lul Y\ 7 j , (5) 

where |u| denotes the cardinality (or the "order") of the set u. These weights are then 

determined by the two sequences: by fo = 1 , , T2 , T3 , . . . and by j\ , 72 , 73 , The 

error bound obtained in the present paper is more complicated than the result in ll23l 
due to the multi-level nature of the algorithm, but we follow the same general prin- 
ciple for choosing weights. It turns out that the "optimal" weights (in the sense of 
minimizing an upper bound on the overall error) for the multi-level QMC-FE algo- 
rithm are again POD weights (0, but they are different from the POD weights for 
the single level algorithm in 11231 . In any case, fast CBC construction algorithms for 
randomly shifted lattice rules are available for POD weights, see El for a recent 
survey, as well as l32ll2Tll9ll27ll28ir7irTTI . 

The outline of this paper is as follows. In [J2] we introduce the function spaces 
used for the analysis and summarize those results from 11231 that are needed for this 
paper. In fj3] we prove the main results required for the error analysis and combine 
them to obtain an error bound for the multi-level QMC-FE algorithm. Finally in [j4] 
we give conclusions. 

2 Problem Formulation and Summary of Relevant Results 

2. 1 Function Spaces 

First we introduce the function spaces from |23| which will be used in what follows. 
Our variational setting of ([TJ is based on the Sobolev space V = (D) and its dual 
space V* = H^ 1 (D), with pivot space L (D), and with the norm in V given by 

l|v||v := l|Vv|| £ 2 (fl) . 

We also consider the Hilbert space with additional regularity with respect to x, 

Z' := {v€V:Av eH- l+l (£>)}, 0<f<l, (6) 
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with the norm 

\Hv ■= (l|v|ll 2(D) + ||4v||^ I+((0) ) 1/2 , (7) 

where, for — 1 < r < 2, the H r (D) norm denotes the homogeneous 7/ r (D)-norm which 
is defined in terms of the L 2 (D) orthonormalized eigenfunctions (pi G V and the 
eigenvalues X in the corresponding spectrum E of the Dirichlet Laplacian in D by 



lei. 



Here, and in the following, we denote by (•, •) the bilinear form corresponding to the 
L 2 (D) innerproduct, extended by continuity to the duality pairing H r (D) x H~ r {D). 
Standard elliptic regularity theory (see, e.g. Ifl3l ) yields the inclusion Z' C H^'(D), 
and for convex domains D and for t = 1 we have Z 1 = H 2 (D) DH^ (D). As already 
seen in [JT] we will also make use of the norm 

\\ v \\w<-~(D) : = max {IMIz.~(D),H V v||L~(D)} • 

The integrand in (0 is F(y) := G(u(-,y)). To analyze QMC integration for such 
integrands, we shall need a function space defined with respect to y, namely, the 
weighted and anchored Sobolev space Wy, which is a Hilbert space containing func- 
tions defined over U, with square integrable mixed first derivatives. More precisely, 
the norm is given by 



I 1 , 



2 \ 1/2 

dy u , (8) 



where the sum is over all subsets u C N with finite cardinality lul, denotes the 
mixed first derivative with respect to the variables yj with j G u, and (j u ;0) denotes 
the vector whose j'th component is yj if j G u and if j £ u. The "anchor" in this case 
is (0,0, . . .), the center of the cube U. 

Since our multi-level QMC-FE algorithm makes use of the FE solution of the 
truncated PDE problem to, say, s terms, we will consider also an s-dimensional vari- 
ant of the space Wy, denoted by W s y, whose norm || • ||^ is defined by replacing 
the sum over all finite subsets u in ([8]) by a sum over all subsets of the first s indices 
(i.e., |u| < °° becomes u C {1, . .. ,s}). For a function F that depends on infinitely 
many variables, if we define F s (yi,. .. ,y s ) := F(y\, ... ,y s , 0,0,- ■■) by anchoring the 
components beyond dimension s at 0, then we have \\F s \w sy = 1 1 F s \iPy < 1 1 ^Un- 
weighted spaces were first introduced by Sloan and Wozniakowski in ll33l . and 
by now there are many variants, see e.g. 11211341 . As in 11231 . we have taken the cube 
to be centered at the origin (rather than the standard unit cube [0, l] s ), and the anchor 
is at the centre of the cube (rather than at a corner of the cube). Moreover, we have 
adopted "general weights": there is a weight parameter y u associated with each group 
of variables y u — (yj)jeu with indices belonging to the set u, with the convention that 
7b = 1. Later we will focus on "POD weights", see ©. As in [23], these POD weights 
arise naturally from our analysis for the PDE application. 
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2.2 Parametric Weak Formulation 

As in 0231 . we consider the following parameter-dependent weak formulation of the 
parametric deterministic problem (Q]): for / G V* andy £ U , find 

u(-,y)eV: 6(y;u(-,y),v) = (/,v) Vv e V , (9) 

where the parametric bilinear form b(y; w, v) is given by 

b(y;w,v) := / a(x,y) Vw(x) ■ Vv(x)dr , Vw,vGV, 
Jd 

It follows from Assumption (A2) that the bilinear form is continuous and coercive 
on V x V , and we may infer from the Lax-Milgram Lemma the existence of a unique 
solution to (|9]l satisfying the standard a-priori estimate. Moreover, additional regu- 
larity of the solution with respect to x can be obtained under additional regularity 
assumptions on / and the coefficients a(-,y). 

Theorem 1 ((23j Theorems 3.1 and 4.1]) Under Assumptions (Al) and (A2), for 

every f 6 V* and every y £ U, there exists a unique solution u(-,y) £ V of the para- 
metric weak problem which satisfies 

||«(-,y)||v < ^ • do) 

^min 

If, in addition, f G H~ l+ '(D)for some < t < 1, and if Assumption (A4) holds, then 
there exists a constant C > such that for every y £U, 

M;y)h <C||/|| fl -m (D)) (11) 

with the norm in Z' defined by O. 



2.3 Dimension Truncation 

Next we summarize a result from l23l needed for estimating the dimension truncation 
error. Given s € N andy G U, we observe that truncating the sum in (fJJ at s terms is the 
same as anchoring or setting yj = for j > s. With { 1 : s} standing for {1, ...,*}, we 
denote by u s (x,y) := u(x, Cy{i- 4 .};0)) the solution of the parametric weak problem ^ 
corresponding to the parametric diffusion coefficient (f2]i when the sum is truncated 
after s terms. As observed in 11231 . it will be convenient for the regularity analysis of 
([TJ and for the QMC error analysis to introduce 

bj = , y > 1 . (12) 

^min 
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Theorem 2 fl[23l Theorem 5.1]) Under Assumptions (Al) and (A2), for every f G 
V*, every G G V*, every y G 1/ and every sGN, f/ze solution u s (-,y) = m(-, (y/i-a;0)) 
of the truncated parametric weak problem (|9]l satisfies, with bj as defined in H2\ . 

IK-,y)-«^ )y )lk<cttl £ 

flmin ; > s+ l 

|/(G W )-7 S (G( M ))| < C II/II ^ I|G( - )II " ( £ *Y (13) 

"mm \ j> s+ i J 

for some constants C,C > independent of s, f and G(-). /« addition, if Assump- 
tions (A3) a«c/ (A5) /zoW, f/ze« 

2.4 Finite Element Discretization 

Let us denote by {V/, }/, a one-parameter family of subspaces V), C V of dimensions 
Mf, < oo. Under Assumption (A6), we think of the spaces V/, as spaces of continuous, 
piecewise-linear finite elements on a sequence of regular, simplicial meshes 3^ in D 
obtained from an initial, regular triangulation ,% of D by recursive, uniform bisection 
of simplices. Then it is well known (see, e.g., J4|) that there exists a constant C > 
such that, as h —> 0, with the norm in Z' defined by (0, 

inf ||v-v A ||v < Ch' \\v\\ z < for all v G Z' , < f < 1 . 

For any y G U, we define the parametric FE approximation »/,(-, y) as the FE solution 
of the parametric deterministic problem: for / G V* andy G U, find 

uh{-,y) e V/, : fo(y;wft(-,y),v ft ) = (/>/,) Vv/, g v h . 

Below we summarize the results from 11231 regarding the FE error. We remark that, 
by considering the error in approximating a bounded linear functional, ff{h 2 ) con- 
vergence for /, G(-) G L?{D) follows from an Aubin-Nitsche duality argument. 

Theorem 3 ((23l Theorems 7.1 and 7.2]) Under Assumptions (Al), (A2), (A4), and 
(A6), for every f G V* and every y G U, the FE approximations «/,(•,)') are stable in 
the sense that 

K(-,y)llv<^. 

"min 

Moreover, for every feH- l+, (D) with0<t< 1, every G(-)g// 1+ ' with < f' < 1, 
and for every y G t/, f/iere /ioZc/ ?/ie asymptotic convergence estimates as h — > 

||H(-,y)-«*(-,y)||v < C//||«(-,y)||z' < C// \\f\\ H -^ [D) (15) 

|G( M (-,y))-G(« A (-,y))| < Ch* \\f\\ H -i +t[D) \\G(-)\\ H _ l+t , {D] , (16) 
where < T :=? + ?'< 2, ant/ where C, C > are independent ofh andy. 
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2.5 QMC Approximation 



As in 11231 . in this paper we will focus on a family of QMC rules known as randomly 

I lis 

2' 2-1 ' 



shifted lattice rules. For an integral over the s-dimensional unit cube [— } 



a realization of an Appoint randomly shifted lattice rule takes the form 

QM^F) :=lfF(frac(|+4)-(l,. ..,!)) , 

where z G If is known as the generating vector, which is deterministic, while A is the 
random shift to be drawn from the uniform distribution on [0, l] s , and frac(-) means 
to take the fractional part of each component in the vector. The subtraction by the 
vector (j, . . . , j) describes the translation from the usual unit cube [0, l] 4 to [—j, j] s - 
Good generating vectors z for POD weights can be constructed using a component- 
by-component algorithm, at the cost of ff(N\ogNs + Ns 2 ) operations, such that the 
"shift averaged" worst case error in the weighted Sobolev space W s ,y achieves a 
dimension-independent convergence rate close to ff(N~ l ). Moreover, the implied 
constant in the big-0 bound can be independent of s under appropriate conditions 
on the weights y u . A short summary of these results, together with references, can be 
found in l23l Section 2]. A more detailed survey can be found in |22|. For the purpose 
of this paper, we only need the following bound on the root-mean-square error. 

Theorem 4 ((23j Theorem 2.1]) Let s,N G N be given, and assume F G W s ,yfor 
a particular choice of weights y = (y u )- Then a randomly shifted lattice rule can 
be constructed using a component-by-component algorithm such that the root-mean- 
square error satisfies, for all X G (1/2, 1], 

/ \ 1/(2A) 

n%{F)-Qs,N^n 2 } < e ^[p(a)] |u| [<p(N)]- 1/m \ 

WuC{l:.v} / 



v s,y 



where E[-] denotes the expectation with respect to the random shift which is uniformly 
distributed over [0, l] 1 , (p{N) = |{1 < z < N — 1 : gcd(z,Af) = 1}| denotes the Euler 
totient function, 

and £(x) = YX=\ k * denotes the Riemann zeta function. 



A rate of convergence arbitrarily close to ff(N~ l ) comes from taking A in the 
theorem close to 1/2. However, note that p(A) — > °o as X — > (1/2)+, making the 
convergence of the sum over u more and more problematic as X comes closer to 1/2. 
For that reason we shall leave A as a free parameter in the subsequent discussion. 
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3 Multi-level QMC-FE Algorithm 

3.1 Formulation of the Multi-level QMC-FE Algorithm 

We are now ready to formulate our multi-level QMC-FE algorithm for approximating 
the integral (01. Let 

h e = 2- e h for £ = 0,1,2,.... 

We suppose that we are given a nested sequence {V/, f }f>o of finite-dimensional sub- 
spaces of V of increasing dimension, 

M,, <M hl <---< M h[ := dim(V /lf ) x 2 M -> °° as I ->• °° , 

where a„ x b„ means there exist c\ ,C2 > such that c\b„ < a„ < C2b„. In the multi- 
level method we specify a maximum level L, and with each level I = 0, . . . ,L of (uni- 
form) mesh refinement we associate a randomly shifted lattice rule Q S( jv ( which 
uses Nt points in sg dimensions. We assume moreover that the sequence {s(}( = o i 
of active dimensions is nondecreasing, i.e., 

So<si<---<st<s L , (18) 

which implies that the corresponding sets of active coordinates are nested. To simplify 
the ensuing presentation, we write (with slight abuse of notation) 

V( = V h( , 3t = 2ht, Qi = QsiM > h=h t , ut = u)l r M t =M h( . 

Here by we mean the FE solution of the truncated problem with S( terms in the 
expansion, which is the same as u/^iy^.^^O). For convenience we define m_i := 0. 
Each lattice rule Q( depends on a deterministic generating vector zi <E If' , but we 
shall suppress this dependence in our notation. A realization of the lattice rule Q( for 
a draw of the shift Ag £ [0, 1] S( applied to a function F will be denoted by Q^(Af,F). 
The random shifts Ao, . . . ,Ai are drawn independently from the uniform distribution 
on unit cubes of the appropriate dimension. With these notations, a single realization 
of our multi-level QMC-FE approximation of I(G(u)) is given by 

L 

(£(A,;G(u)) := £&(^;G(«<-h/-i)) , (19) 

where A* := (Ao,... ,Al) will be referred to as the "compound shift": it comprises 
all s* := Y,{=o s e components of the random shifts Ag. Equivalently, A * is drawn from 
the uniform distribution over [0, l] s *. 

The randomly shifted version of dT9b that we use in practice makes use of mg i.i.d. 
realizations of the level-^ shift Ag, thus takes the form 

L 1 mt c\ 

Q L (G(u)) := £ — Y^QtiA Y;G(u t -ut-i)) . (20) 

In the subsequent analysis we work with exact expectations of ( fT9l , but in the final 
section we return to (f2Qb . and there justify choosing mg to be a fixed number inde- 
pendent of I. 
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3.2 Error Analysis of the Multi-level QMC-FE Algorithm 

Using linearity of /, I(, Q( and G, we can express the error as 

L 

I(G(u))~Q I ;(A Sf -,G(u))=I(G( u ))-Y,Qe(AfMue-»e-i)) = T i +T 2 (A*), 

e=o 

where 

L 

7j := I(G(u)) - £ h{G{u t - u t -i)) , (21) 

i=Q 

L 

T 2 (A*) := Y,(!t-Qt(AtM G { u t- u t-l)) > 

where we introduced the operator notation Q(A)(F) := Q(A\F). Since a randomly 
shifted lattice rule is an unbiased estimator of the original integral, it follows that the 
mean-square error for our multi-level QMC-FE method, i.e., the expectation of the 
square error with respect to A * G [0, l] s *, simplifies to 

E[|/(G(«))-G*(-;G(«))| 2 ] = 7f +E[r 2 2 ] , (22) 

where the cross term vanishes due to Epi] = 0, and we have 

Wi] = £E[|(/ f -e,(-))(G(Mf-M^i))| 2 ] , (23) 

1=0 

where the expectation inside the sum over index I is with respect to the random shift 

A ( e[0,l} S1 '. 

First we estimate T\ given by KT\\ . Since Uf> — only depends on the first 
Si dimensions, we can replace I((G(ue — w^-i)) by I(G{ui — ui-\)), and hence the 
expression ( f2TT > simplifies to 

T\ = I(G(u - u L )) = I(G(u - u h[ ))+I(G(u hL - u s h L L )) . 

Here m/, l — u s £ is the error that we incur in the FE approximation by omitting in the 
coefficient expansion (f2]i all terms with indices j > sl. As we will show in Theorem[5] 
below, this dimension truncation error vanishes for certain types of (multiresolution) 
coefficient expansion ©. To allow for this, we introduce a parameter 9l S {0, 1}, and 
arrive at the estimate 

|7i| < sup|G(u(- J y)-«* 1 .(- J y))l + dL\I(G(u hL -uf ))\ 
yeu 

< Chi\\f\\ H - l+t{D) ||G(-)||^ (fl) + QlC mW ' l|G(OI|vt ( £ bX, 

"mm \ j>s L +l / 

(24) 

where for the first term we applied ( TToT l from Theorem[3] and for the second term we 
used ( fT~3b from Theorem|2]but adapted to the FE solution u% L instead of u. 
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Next we estimate E[r 2 2 ] given by ( f23b . We have from Theorem[4]that 

L ( \VA 

Wi\<Y.\ E ^[p(^)] |u| [<p(^r 1/ "llG(<-<r' 1 )llk y . (25) 

e=o \0^uc{i :jf } / 
To estimate each term in d25b for £ =^ 0, we write 

l|G(< -<:;)lk f , r < llG(< -<_,)lk,, r + IIG^, -u^)!!^ . (26) 

In i]3.4l ahead, we bound the two terms in d26l i separately, and then return to complete 
the error analysis in £]3.5I Note that the second term in d26b vanishes if 5/ = It 
also vanishes in the special case when, for all I > 1 and an appropriately chosen in- 
creasing sequence % we have « A = «^ ( = M/, fl . This can happen when there is a 
special orthogonality property between the functions yfj in the representation (f2]i and 
the FE spaces V(. We discuss this very important special case in the next subsection. 



3.3 A Special Case with an Orthogonality Property 

In this subsection we suppose that the sequence y/j has properties usually associ- 
ated with a multiresolution analysis of L 2 (D), as shown in the Haar wavelet example 
below. For this purpose it is useful to relabel the basis set with a double index, as 

Wj :;>!} = {<:«> 0, m e /„} , (27) 

where the first index n indicates the (multiresolution) level, and the second index 
m G J„ indicates the location of a level-n basis function within D, with J n denoting 
the set of all location indices at level n. We suppose that all basis functions yi" n at 
level n are piecewise polynomial functions on the triangulation ,%, and have isotropic 
support whose diameter is of exact order h n , implying \J n \ x 2 dn . 

Definition 1 Let S°(D, ,9) and S l (D, ST) be the subspaces defined by 

S°(D, ST) := {v £ L 2 {D) : v\ K £ P°{K) for all K G £?} , 
S l (D, ST) := {v £ Hi (D) : v\ K e P l (K) for all K £ ^} , 

where P r {K) denotes the space of polynomials of degree less than or equal to r on 
the element K. We say that the set { y'm }n>o,meJ„ has the k- orthogonality property, for 
k 6 {1,2}, with respect to the triangulations : £ > 0} if for all £ > we have 

/ K(*Wjc) dx = for all n > £ + k , m £ J„ , and z f e 5°(Z), 3T t ) , (28) 

and y/'^ e S k ~ l (D, 3z+k-\) for all n < £ + k - 1, m e /„, and diam(supp( \j/" n )) x /z„. 

A necessary condition for d28l to hold is that the functions y/'^ for « > k have the 
vanishing mean property, that is 

/ V / "i( JC ) = for all « > k and all ;7z e / n • 
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Example 1 (Haar Wavelets) We describe here the simplest case, of Haar wavelets for 
a one-dimensional domain D = [0,a], with a some positive integer greater than or 
equal to 2. In the Haar wavelet case we may take, for m = 0, . . . ,a — 1, 



where d" n is a sequence of nonnegative scaling parameters, y/(x) is 1 for x £ [0, 1), 
— 1 for x £ [1,2), and otherwise. The family {vO forms an orthogonal basis of 
L 2 ([0,a]) if d" n > 0. We remark that the choice d" n = 2^ 1 ^ 2 which is well-known to 
imply orthonormality of the \f/" n in L? ( [0, a] ) is inconsistent with (Al), and is therefore 
excluded. 

For the finite element space Vo we take the piecewise-linear functions vanishing 
at and a. This space is spanned by the hat functions centered at 1,2, ...,a — 1. 
The spaces V( are then the piecewise-linear functions on [0,a] vanishing at and a, 
spanned by the hat functions centered at multiples of 2~f Correspondingly, ,9( is 
the mesh consisting of the multiples of 2~ ( , and the elements K( are the intervals of 
length 2~ e between the mesh points. 

With this definition of the multiresolution sequence { y/" n } has the ^-orthogonality 
property with respect to Sfy with k = 1, for all £ > 0. For example, for £ = and 
n = 1 ,m = we have, with zo E S°([0,a], .%) and c := Zo|[o,i]> 



Haar wavelets do not satisfy Assumption (A4), since for (A4) to hold the basis func- 
tions need to be Lipschitz continuous. A piecewise-linear fc-orthogonal basis set 
with k = 2 in dimension d = 1 is constructed, for example, in O. For detailed con- 
structions of ^-orthogonality basis sets with k = 2 and d > 1, see Il8l l25l : for the case 
k = 1 and d > 1 see (2j Section 5]. 

In the following theorem, we show that there is no truncation error at any level 
for our multi-level algorithm under ^-orthogonality if the dimension for truncation 
Si is chosen appropriately at each level. To achieve this, we employ a one-to-one 
mapping of the indices between the functions y/j and y/'^ in 1271 : instead of ordering 
the functions as in Assumption (A5), we index j according to a level-wise grouping 
so that the functions { \$ n } m ej come before the functions { yr} n } m ej l , followed by the 
functions { yi} n },„ej 2 > an d so on - Correspondingly, we employ the same index mapping 
between yj and y" n for the components of y. 

Theorem 5 Let {yr" n : n > 0,m S J„} be a multiresolution basis set for the domain D, 
with \ J n \ x 2 d ", which has the k-orthogonality property with k £ {1,2} with respect to 




forx G [m,m+ 1) , 
otherwise , 



and for « > 1 , 



VCW :=<Cy(2 n JC-2m), m = 0,...,2"- 1 a-l, 
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the triangulations : I > 0}. Let {yj : j > 1} = {y" n : n > 0,m € J n } denote the cor- 
responding parameters under the level-wise relabelling ( 127b so that the parametric 
coefficient in d2j can be represented in the form 

a(x,y) =a(*) + £ £ 

n=0m£j„ 

Let 

t+k-l 

*/:= E W- ( 29 ) 
77zen x 2 ^ x M/, f , and for all £>0 we have 

u h = u h e ■ ( 3 °) 

Moreover, the cost for exact evaluation of the Finite Element stiffness matrix for the 
parametric coefficient a(x,y) at meshlevel £>0for any given y € U is 6'{Mi H log(M/, ( )) 

Proo/ There holds VV e C 5°(D, for all £ > 0. Thus, for all £ > and for ev- 
ery ve,wg 6 V/5, we have Vw( ■ Vv# G S°(D, ^). The ^-orthogonality property (l28l i 
therefore implies for all £ > and for all , vt^ <E 

^l" 1 I ^vCwWrVv.d* (31) 

«=0 rngJ^ / 

The assertion ( f30b then follows from the uniqueness of the FE solutions. 

To show the assertion on the cost, for given y we denote by 2? £ (y) the Mi x Afy stiff- 
ness matrix of the parametric bilinear form b(y\ ■, •), restricted to V( x Vi, where V( = 
span{0/ : 1 < i < M(}, with 0/ denoting the nodal hat basis functions of S l (D, ^f). 
By ^-orthogonality of the y/^, we have (|3T1 i. and for each 1 < < Me = dim(V^) = 
&{2 de ) there holds 

B ( iy)n> = b(y {hse y,tf, = jf (P e+k -ia(x,y))V<l>f ■ V^dx , (32) 

where P( + i < _ia(x,y) denotes the truncated expression for a(x,y) appearing in OTb . 
The matrix B^(y) is sparse: it has, due to the local support of the hat functions (j)f and 
due to the construction of the sequence {^e}e>o of meshes, at most (?(M() nonvan- 
ishing entries (l32l . 

Now consider the cost for the exact evaluation of any matrix entry (B (y))n' ^ 0. 
Given £, i, i' , and for a given n < £ + k — 1, it follows from the assumption on the 
support of Ym m at there are only many functions \f/" n such that f D y/",(x) V0 ; f • 
V0v dx =^ 0. Thus the cost for evaluating (B e (y))iji ^ is 6(£ + k — 1 ), which yields 
that the total cost for evaluating the sparse matrix is G"(M(£) = G{M^ log(Afy)) oper- 
ations. □ 
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3.4 Key Results 

In the error analysis of the (single level) QMC-FE method, we established in l23l 
regularity results for the parametric solutions. In the present multi-level QMC-FE er- 
ror analysis, we first establish stronger regularity of the PDE solution simultaneously 
with respect to both x and y. The result shown is actually more general than required 
in this paper: our result covers partial derivatives of arbitrary order. To state the result, 
we introduce further notation: for V = (Vj)j>i £ Nq , where No = N U {0}, we define 

|v| := V] + Vi H , and we refer to V as a "multi-index" and |v| as the "length" of V. 

By 

£:= {VGl< : |V|<°°} 

we denote the (countable) set of all "finitely supported" multi-indices (i.e., sequences 
of nonnegative integers for which only finitely many entries are nonzero). For VG? 
we denote the partial derivative of order V £ 5 of u with respect to y by 

W ^ |V| 

°y u ■- avi 3V 2 u ■ 

VI <o>2 ' ' ' 

Theorem 6 Under Assumptions (Al) and (A2), for every f £ V*, every y £ U and 
every V £ 5, the solution u(-,y) of the parametric weak problem (|9]l satisfies 

\j>i / mm 

where bj is as defined in (1121) . If, in addition, f £ H~ i+ '(D)for some < t < 1, and 
if Assumption (A4) holds, then for every K £ (0, 1] there holds 

H v «(-»y)L ^ c ' v l ! (u h ?) \\f\\H-^ { D) ■ (34) 
\j>i j 

where 

h :=bj + KQ{\\^Yj\\ L . (D) +B\\Yj\\ L . (p) ) , j>l, (35) 
and the constants B and Q are, for < t < 1, defined by 

1 IIHIff-'+'CD) 

B := sup||Va(-,z)|| L - (D) <oo, C, := sup — ^ < oc . (36) 

fl min z£U weI?(D) \\ W \\L 2 (D) 

In (134b we have C < Ck~ 1 with C > independent of K. 

Proof Assertion ( f33l > was proved in [6 Theorem 4.3]. The proof there was based on 
the observation that, for every v £ V, y £ U and V £ ^ with |v| 7^ 0, (|9|l implies the 
recurrence 

(a(;y)V(d y v u(- 7 y)),Vv)+ £ Vj (yj V(^~ ej u(;y)) , Vv) = , (37) 

;'6supp(v) 
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where ej G 5 denotes the multiindex with entry 1 in position j and zeros elsewhere, 
and where supp(v) := {j G N : Vj ^ 0} denotes the "support" of V. Taking v(x) = 
dyu(x,y) G V in (f37T > leads to 



\d y V u(;y)\\y < £ V,^||^ M (. !y )|| V 
;'Gsupp(v) 



(38) 



from which d33b follows by induction. 

Assertion d34l i was proved in |6| Theorem 8.2] for the case t = 1. For complete- 
ness we provide a proof for general t here. We proceed once more by induction. The 
case |v| = is precisely (fTTT) and is already proved in (23, Theorem 4.1]. To obtain 
the bounds for |v| ^ 0, we observe that, trivially, for every V£? and for every y G U, 
the function dyii(-,y) is the solution of the Dirichlet problem 

-V.(a(-,y)V(^ v «(-»y))) = -8v(;y) in D, d^u(-,y)\ dD = , (39) 

with 

g v (.,y) := V-(a(;y)V(d y v u(;y))) = Va(-,y) • V(^ v «(-,y)) + a(-j)A(d y *u(;y)) . 

Here, we used the identity 

V-(a(x)Vw(jc)) = a{x)Aw(x) + 'Va(x)-Vw(x) , (40) 

which is valid for a G W 1,0O (D) and for any w G V such that Aw G L 2 (D). 

The assertion (l34i l will follow from (fTTl i. which implies for the solution of prob- 
lem ( 1391 the bound 



|dvV,y)b <c||g v (-,y)l| ff - 



'(D) 



(41) 



It remains to establish bounds for ||^v(-,y)||//-i+'(£))- We recast ( f37T > in strong form 
and obtain from d39l l, for every y G t/ and for every v G H (D), 

l(ffv(-,y),v)| = |(V.(a(.,y)V(^ v «(. iy ))) ,v)| 



< 



£ V; (Vvo- ■ V(# %*CoO) + Wj A {d; ej u(.,y)) , 

;'Gsupp(v) 

"v ¥j (.).V(d;-^u(.,y)) + Yj(.)A(4- ej u(;y)) 



H- l +'(D) 



l v llffl-f(0) 



y'GsuppfV) 

Dividing by ||v|| ff i-tp) and taking the supremum over all v G H l ~'(D) yields 



\\gv(;y)\\ H -i + >(D) < I y, ( I|VyoIU-(d) v(a; 

;'Gsupp(v) 



H-!+'(£)) 

V0lli-(D)ll^(^~ '"(-00)11 

. (42) 
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To bound the second term on the right-hand side of d42l) . we write ( f39b with V ej 
in place of V, for every y G U, in the form 

-a(;y)A(d;- ej u(;y)) = Va(-,y) ■ V(^ M (-,y)) -g v - ej (;y) , (43) 
using again ( 140) . This implies, for every y G f/, the estimate 

||4(^u(-,y))|| fl _i4, {fl) < — ||RHS of e3|| ff -i +f(D) 

^min 



< 



1 



sup || Va(-,z) || £ -( fl) ) ||V(<? y v e; M(-,y))|| H -i+, (D) + ||§v- ej (-,>')||//-i+ f ( D ) 



<fiC r ||^ V e 'u{-j)\\ v + \\gv-ej(;y)\\ H -i+'(D) » 

^min 

where B and Q are as in d36b . We insert this bound into d42b to obtain 

IM-,y)||ff-i-w (i ,) < E v /[Q(ll v ^-||L-(fl)+B|l^lli»(o))ll^"(- 1 y)llv 

y'esupp(v) 



-feil|gv-^(-,y)|| 7? -l+»( ) 



(44) 



This recursive estimate for ||gv(',y)||#-i+>(D) has structure which is similar to the 
bound ( f38l > for || w (- ,y) 1 1 v'- We therefore multiply (l44b by K > and add it to 
to obtain 

ll^"(^y)llv + K||Sv(-,y)llff-i-w(D) 

< E V J b j\\\ d y' eiu (-j)\\v + K\\gv-e j {-,y)\\ H -^'(D) 
;'€supp(v) 

+ E V J KC <{\\VVi\\L~(D)+B\Wj\\ L ~ {D) )\\dy~ ei u(-j)\\ V 

yesupp(v) 

v-e. 



< E vjbj 

yesupp(v) 



\dy ' u(-,y)\\v + K\\gv-ej(-,y)\\ H -i+>(D) 



(45) 



where bj is as in (1351 1. By Assumption (A4), we have L;>i bj < °° for any choice of 
K > and for any B. 

To establish (1341 it remains to observe that the estimate ( l45l l has the same structure 
as (l38l ). with the sequence {fr/} in place of {bj}. For |v| = 0, we find using ( TTOb of 
Theorem[T]and go = — / that 



ll"(-,y)llv + K||*ob-i-K(D) < — ll/l|v + K||/||fl-i-W 



(D) 



The same induction argument used to establish ( 1331 applied to the recursive estimate 
( f45l > implies for all V £ 5> for every y G E/ and for every K G (0, 1] 



K||gv(-,y)||tf-i+<(fl) < l|^«(-,y)llv + K||gv(-,y)||fl-i+f(D) 



'(D) 
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where C, := sup 11 , eH -i+,( D )(||vv|| // -i( D )/||w|| // -i + ,( D )) < °°. Now (O follows from 



To bound the first term in (126l i we need the following result. 

Theorem 7 Under Assumptions (Al), (A2), (A4), and (A6), for every f E H~ l+, (D) 
with < t < 1, every G(-) E //~ 1+f (D) witfi < /' < 1, every K E (0, 1], and every 
s E N, we have 

\\G(u°-u s h )\\^ y 

iisii ii^mii f V [(N + l)!] 2 nyeu^V /2 
< C^iwII/llff-i+^ojIlGCOI^-i+^oJ J- ^ — — ' 

\uC{l :i } Ai / 

where < T := f + f ' < 2, where bj is defined in ( 135t , a«t/ where the constant C > 
is independent ofs. 

Proof Let g e (D) denote the representer of G(-) £ (D). For all y E U, 

we define v g (-,;y) G V and v s h (-,y) E V/, by 

b(y;w,v 8 (-,y)) = (g,w) Vw e V , (46) 
%;w„,vf (-,y)) = {g,w h ) Vw/, g V ft . (47) 

Taking w = u(-,y) — Uh(-,y), we have 

G{u{-,y)-u h {-,y)) = (g,w(-,y) -u h {-,y)) = b(y;u{-,y)-u h {-j),v g (-j)) 

= b(y;u(-,y)-u h (-,y)y(-,y)-v g h ) , 

where in the last step we used Galerkin orthogonality b(y;u(-,y) — K/,(-,y),v?) = 0. 
Using the definitions of the bilinear form b(y; ■, •) and the norm || ■ \ \y^ sy , we obtain 

/ 2 \ '/2 

IW-<)lk, r = £ XT 1 ( , . / r u (x,(y u ;0))dx dy u ) , (48) 

\uC{l:.v} • / [-7-t] |u| JO / 

where 

ru (*> y ) := dy~ ( a (*>y) v ("- "ft) (*>y) -v(v g -vf ){x,y) 

For the remainder of this proof, we will use the short-hand notation <9 U for the mixed 
first partial derivatives with respect to the variables yj for j E u. We will also omit x 
andy in many places. From the special form of a(x,y) we see that 

r u (x,y) = a(x,y) d u (V(u - u h ) ■ V(/' - vg)) 

+ £ yr k (x) d u \ [k} ( V(« - «*) • V(v« - vf) 

fc€U 

= a(*,y) L ( M _ M/) ) ' v<5 u\d (v 5 - vf ) 

DCu 

fceu t>Cu\{*} 
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where in both terms we used the product rule <9 U (AB) = Y.x><zu(d\>A)(d u \ X) B). Thus 
/ r u (x,y)dx < a max £ \\d (u - u h )\\ v \\d u \ {v 8 - vf )||y (49) 

DCu 

+ E IIv^Hl^d) L ll 5 t)("-"A)||v||5( u \ W )\ (v g -vf)||v. 
fceu tjCu\{d} 

Denoting by ^ : V — s- V the identity operator and by ^ : V — > V - /, the correspond- 
ing FE projection, we can write 

H(«-«*)||v = \\dt,(S-& h )u\\ v = \\(S-& h )(d t ,u)\\ v 

< Ch> Wfay < Of WfWa-^ip) |t>|! ]J~bj , (50) 

where the first inequality follows from (IT~5b in Theorem [3] and the second inequal- 
ity follows from (f34-b in Theorem [6] Throughout, C > denotes a generic constant. 
Similarly, it follows from (|46| | and d47b that 

R\»(v*-vj)|| v = Wd^is-^ywv = \\(s-& h )(d u \y)\\ v 

< Ch'' R\ D vi z ,, < Cl/ \\g\\ H - l+t > {D) |tt\t>|! ]1 ^ ■ ( 51 ) 

./Gu\d 

Using (|50j> and (|5TJ, together with the identity £ D c u 1 1 ! |u\ 1 ! = ( |u| + 1 ) !, we obtain 
from(|49l 

/ r u (*,y)dx < C// + ''fl max ||/!| // - 1+ , (D) ||^|| H _ 1+ , (D) (|u| + l)! l\bj 
JD jeu 

+ CA' + ''||/|| ff -. +f(D) ||g|| H _ 1 ^ {D) £||%||L-(D)|u|! n *J 

£eu ;'eu\{fc} 

< C//+' a max ll/H 

l|G(-)]| H -t +( ' (fl) (M + l)!n^: 

;'eu 

where we used the estimate || II l°°(d) = a mmbk < ^max^t- Substituting this estimate 
into (l48b completes the proof. □ 

As we remarked earlier, if ^-orthogonality (f28b does not hold and if se > S£- \ , the 
second term in ( 126b is generally nonzero. We estimate it in the following result. 



Theorem 8 Under Assumptions (Al) and (A2), for every f G V*, every G(-) € V*, 
every h > 0, and every I > 1, 

iiGK'-«r)ik,, r 

< «™*m : / £ i(iui + i)f n^A ' 7 ^ 

where bj is defined in H2\ . 
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Note that if Assumptions (A3) and (A5) hold, then from (TT4T > we have Yff ,,b; 



Proof We follow a similar line of argument to the proof of Theorem [7j Let g £ V* 
denote the representer of G(-) £ V. For all y £ U, we define vf (•,)>) £ V/, as in ( l47l i. 
Taking w/, = u£(-,y) — m^ 1 (-,y) in (147k we have 

G(««(- l y)-u^- i c.y)) = U,«?(-.y)-"»" 1 (-,y)) 

= b(y;u s <(;y)-u s h ^(.,y),vi(;y)). 
Using the definitions of the bilinear form b(y;-,-) and the norm || • ||^j y , we obtain 



iiGK-«r)ik„ r 



I ru 1 

^uC{l,..., J{ } 



r u (*,(y u ;0))dx 



1/2 



(52) 



where 



As in the proof of Theorem [7] we will use the short-hand notation <9 U for the mixed 
first partial derivatives with respect to the variables yj for j £ u, and we will also omit 
x andy in many places below. From the special form of a(x,y) we see that 

r u (x,y) = fl(*,y)5 u (v(^-^- 1 )-Vvf) + £^W5 u \ W (v(^-^- 1 )-Vv« 

keu 



a{x,y) E ^ d *>( u h ~ u h > )- vd n\vK 



OCu 



- E WfcG*) E V<? » ( M A ~ M A ) ' Vd (u\{k})\x,v[ 
k£u »Cu\{i} 



Thus 



r u (x,y)dx 



< Ornax E ll^Ofc OllvlKxcVfllv 
DCu 

+ E 



(53) 



L-(D) E ' -«// 1 )llvlP(u\W)\D^I|v • 

X>Cu\{k} 



For any y £ U, u s H-,y) and 1 (-,y) are the solutions of the variational problems: 

(a"(-,y)V««(. J y),Vv) = (/,v) VveV A> 
(-,y)V^-' (.,,), Vv) = (/, v) Vv £ V A . 

Subtracting, we get (fl s '(-j)V^(-j)- fl s '-i (•,y)V M * f -' (-,y), Vv) = for all v £ V,„ 
or equivalently, 

(a^(-,y)V(^(-,y)-^- 1 (-,y)),Vv) 
= -((^(•,y)-^-H-,y))V«f 1 (- J y),Vv) VvGV A . 
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We interpret this as a weak problem for — vti 1 with the forcing term 

/:=v-((««(.,y)-^- 1 (-,y))v«?- 1 (-,y)) e v*. 

Hence we conclude from d33l l of Theorem|6] adapted to the finite element solutions, 
that 



|| W-^" 1 



<i.i.(n»,)^ 

\/'GD / "nun 



For any v G V , we obtain with integration by parts and the Cauchy-Schwarz inequal- 
ity, and Theorem[3] 



l(/,v)| 



< 



< 



V- {(a s <(x,y) -a s «(x,y))V^- 1 (^3')) v(*)d* 
J D (a Sl (x,y) -a s i-\x,y))Vuf l {x,y) ■ Vv{x)dx 

L~ ( D^\\ur{;y)h\Mv 



1 2 

Z y=%_i + l 



L ;=Jf-i+i 



v v 



1 Sf 



IvIHIv, 



which yields a bound on ||/||y», and in turn this gives 



<? (u 



v<\*Hn>j u e * 



yet) 



1 



2 " J 
J'^c-i+i 



11/11 



(54) 



Next, from ( f4Tb we can interpret vf as the solution of the weak problem (|9]i with 
the forcing term g. Thus we can apply (l33l l of Theorem|6]again to obtain 



Pu\ ^iiv < iu\t»i!( n b J 

<jeu\<o 



\\g\W* 



= t> 



n ^ 

y'eu\e 



l|G(-)lk* 



(55) 



Using (|54|, (|55l), and again £ c u |o|! |u\u|! = (|u| + 1)!, we obtain from d53l l 
r u (x,y)dx 

ll/llv*l|G(-)l|v 



< 



i st 



DCu 



y'eo 



J>|! mbj |u\t>|! n *V 



;6U\0 



fceu hCu\IB V /GO / \ i(=(u\tk\)\n / 



yeo 

ll/llv.||G(.)||v 



ll/||v||G(-)||v 



£Gu t)Cu\{it} 

1 t *,)(n*/ 

'y=sf-i+i 7 v y'eu 

<( £ bj)(Ubj 

v y'=jf-i+i J v y'£" 

Substituting this estimate into (l52l i completes the proof. 



ye(u\{t})\» 

amax(|u| + l)!+flmin|u||u|! 



amax(|u| + l)! 



□ 
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3.5 Error Analysis of the Multi-level QMC-FE Algorithm (Continued) 



We are now ready to estimate the two terms in (126l i for £ ^ 0. To bound the first term, 
we use the triangle inequality 



t , y <||G(««-«? 



\G{u sl -ul_ 



hl-\IWW Ht y 



and then apply Theorem|7]to both terms on the right-hand side. If ^-orthogonality (1281 ) 
does not hold and if sg ^ sg-\, we bound the second term in (TZoT i using Theorem [8] 
For the £ — term in (1251 ). we use the estimate 



HG(<)!k, r < 



,uC{1:jd} 



7u 



1 M 



l|G(0llv* 



5lu 



Uho 



(■,(y tt ;0)) 



1/2 



< 



ll/IH|G(Ollv* 



E 

,uC{l:i } 



[u|!) 2 n y e ufej 

7u 



2\ 1/2 



which follows from an adaptation of d33l from Theorem [6] Combining these esti- 
mates with (f22]i, d24j, (E3, and (ESJ), we obtain 



E[|/(G(«))-G^(-;G( M ))| 2 



<c 



^ll/ll ff -i +t ( )l|G(-)ll ff -i + ,' (D) + fe||/llv*l|G(-)llv»( £ ft. 



1 2 



1/A 



E tf[pW]' U| ] [<P(iVo)]-^||/|| 2 ^||G(.)|| 2 v* £ (|u|!)2n ^^ 



,0/uC{l: So } 



uC{l: i0 } 



7u 



E 7^[P(A)] |U| 



1/A 



[fl)(JV f )] 



-1/A 



£=1 \0^uC{l:i / } 

/l f-l l|G(-)ll // -l+r'( D ) 



|u| + i)!] 2 n 7 - eu ft- 



E2\ '/2 



■0/-i||/||v||G( 



E *j 



k uC{l:^} K« 

v / [(|u| + i)!] 2 n, eu ^ 2 



2\ I/* 



,UC{1:^} 



7u 



where we introduced the parameters i 6 {0, 1} for each level, analogously to d24b . 
to handle the case where fc-orthogonality d28l ) holds or when = 

When ^-orthogonality (l28T > does not hold, we further impose Assumptions (A3) 
and (A5) to make use of ( fT4b for estimating the tail sums of bj. These together with 
some further estimations lead to the following simplified mean-square error bound. 

Theorem 9 Under Assumptions ( A1)-(A6), for every feH- 1+t (D) with0<t< 1 
and every G(-) G H~ 1+t (D) with < t' < 1, f/te mean-square error of the multi-level 
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QMC-FE algorithm defined by ( 119b can be estimated as follows 



E[|/(G(«))-et(-;G("))| 2 ] < CD r (A)||/||2_ 1+f(D) ||G( 



e L s, 



-2(l/p-l) 



(=0 



-1/K 



'(D) 
-(I/P-I) 



where 



D r (X) 



E k>(a)] |u| 

l«l<- 



i)i] 2 iW*f 



7u 



(56) 



(57) 



w;?/z < T := / + /' < 2, = 0_i = 5_i = 0, p(A) as in d!7l i, one/ /5j as ;n d35l l. /n 
general we have 9( = 1 /or all£ = 0,...,L. If S( = Sf_ i /or some £ > 1 f/zen i = 0. 
When k- orthogonality ( 128b /zo/(is we /lave = 0/or all £ = 0,...,L. Assumptions 
(A3) and (AS) are not required when 9( = for all t. The expectation E[-] is with 
respect to the random compound shift which is drawn from the uniform distribution 
over [0, l] 4 '*. The error bound d56b is meaningful only ifDy(X) is finite. 



3.6 Choosing the Parameter A and the Weights y u 

Following [23), we now choose the weights y u to minimize Dy(X). We also specify 
the value of A to get the best convergence rate possible. Note that our goal is to have 
A as small as possible, since a smaller value of A yields a better convergence rate 
with respect to the number of QMC points. In the following theorem, the assumption 
is implied by Assumption (A7). 



Theorem 10 With bj defined as in (135b for fixed K S (0, 1], suppose that 

K < °o for some < q < 1 , (58) 

and when q = 1 assume additionally that 

E h < 2 ■ ( 59 ) 



For a given A G (1 /2, 1], the choice of weights 



x 2/(1+1) 



minimizes Dy{X) given in ( 157b , if a finite minimum exists. Moreover, the choice of X 
given by 

for some 8 £ (0,1/2) w/ien g £ (0,2/3] , 



2-25 

A = A ? := ^ -J— w hen qG (2/3,1) , (61) 

1 w/ien = 1 , 
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together with y u = Yu(Xq), ensures that Dy{X) < °°, and thus justifies the error bound 
(OS 

If we maintain the definition ( 16 11 1 ofX but instead of ( 1601 ) define the weights by 
Yu--= ^(M + lY.U(2bj)\ , (62) 

then Dy(X) is no longer minimized by this choice of weights, but Dy(X) < °° still 
holds provided that 8 < q/2 when q G (0,2/3]. 

We remark that the weight (l62l has a practical advantage over d60t . that with ( l62l it 
is not necessary to make a prior choice of X. 

Proof This proof follows closely the proofs of 11231 Theorems 6.4 and 6.5]. Apart 
from the simple replacement of bj by bj and of p by q, the main difference is that we 
now have to handle a sum containing the factor (|u| + 1)! instead of |u|!. For this we 
make use of l23l Lemma 6.3] with n = 1 instead of n = 0. 

Using |23l Lemma 6.2], we see that Dy(X) is minimized by choosing y u as in (l60l l 
for |u| < °°, provided that a finite minimum exists. The relative scaling of weights 
does not affect the minimization argument. Our choice of scaling here is consistent 
with the convention that "fy = 1 . 

In the course of our derivation below we eventually choose the value of X de- 
pending on the value of q, but until then X and q will be independent. For the weights 
given by d60l ). we have 

E (fu)*[pW] |u| = E [(|u| + i)!p/(^n(^p^)) 1/(I+A) - : ^, 

|u|<<*> |u|<oo y'Gu 



[(iu| + i)!] 2 n, eu ^ A 

L ~* - A A , 

|u|<<*> '« 

andthus£> r (A) =a\ /1+1 . 

For X £ (1 /2, 1), we have 2A/(1 + X) < 1 and we further estimate Ax as follows: 

we multiply and divide each term in the expression by Iljeu with (Xj > 

to be specified later, and then apply Holder's inequality with conjugate exponents 
( 1 + X ) / (2X ) and ( 1 + X ) / ( 1 - X ), to obtain 

ax= E[(iui + i)^ /(i ^n< /(1+A) n(^ 

|u|<»° ;'eu j'gu \ w j 



< E (|u| + i)!n«; 




bfp(X) 



|u|<~ jeu / \ |u|<~./'gu \ a j 
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which holds and Ax is finite, see 1231 Lemma 6.3], provided that 

g A 2V(l-A) 



2>,<1 and EItt) <°°- < 63 > 



b q 

— for some parameter U5 > E ^1 ■ (64) 
07 />i 



We now choose 



Then the first sum in ( 1631 is less than 1 due to the assumption (1581 1. Noting that 

implies that £/>i "j < 00 f° r a U </ > 9> we conclude that the second sum in 
converges for 

2A .„ . 2A . q 

■(1-9) >9 9<T-^ ^ A ^ 



1-A V *' - ' ' - 1+A 2-$ ' 

Since A must be strictly between 1 /2 and 1, when g £ (0,2/3] we choose X q = 1/(2 — 
25) for some 5 6 (0, 1 /2), and when g G (2/3,1) we set X q = q/(2 — q). 

For the case q = 1 we take A ? = 1, and we use p(l) = 1/4. Then using 
Lemma 6.3] and the assumption (l59l l we obtain 

E(H+i)!nfl)< Lv.\,.J 2 <°°- 



| U |«> 



Finally we show that Dy(A) < °° for A given by d6Tb and weights given by d62l . 
For the case q = 1 and A = 1, the weights (f60T > and d62l l are the same, so we need 
to consider only the cases q £ (0,2/3] and q £ (2/3, 1). To simplify the presentation 
below we introduce q' := A (2 — q). Then, with A given by d6Tb . with the additional 
restriction that 8 < q/2, it is easy to verify that q' = q foiq £ (2/3,1) and q < q' < 1 
for q £ (0,2/3]. In both cases, we have 



Oy(A) = E [(|u| + l)!] 9 'n((2^p(A)) E KM + WIT ^ 




2 2 ~i 

JGU / \ U <°° JGU 



For the first sum, we multiply and divide the terms by Wjeu a ] > w i m a j > to be 
specified later, and we apply Holder's inequality with conjugate exponents 1 jq 1 and 
1/(1 — q'). For the second sum, we multiply and divide the terms by Iljeu a j> with 
the same aj, and we apply Holder's inequality with conjugate exponents l/q and 
1/(1 -q). We obtain 



(W)(2-g) 



D r (^)< e (n+i)ti«> e n 

\|u|<°° ;'eu / \ |u|<oo;'gu 



(2^) 9 P(A) 



E(w+i)>rK' e n u^fe? 

| U |<oo ,/GU / V | u |<oo7GU \ z "/ 
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which is finite as above provided that 

j>\ j>\ \ u .i j j>\ \ u .i j 

and this can be achieved by choosing a ; - as in ( 1641 . since (1 — > q/(l — q). 
This completes the proof. □ 



3.7 Summary of Overall Cost Versus Error 
Recall that 

h e x 2~ l and M /lf x hj d x 2 W for ^ = 0, . . . ,L . (65) 

Based on the mean square error bound d56*l l. we now specify S( and N( for each level. 
We consider two scenarios depending on whether or not ^-orthogonality ( l28l ) holds. 

For our cost model we assume the availability of a linear complexity FE solver. 
The cost for assembling the stiffness matrix at level £ is 0(sgMf le ) in general, and is 
0{My l , log(M/, ( )) if ^-orthogonality ( f28l ) holds (see the second part of Theorem[5]). 
Moreover, we assume that the functions yfj are explicitly known, and that integration 
of any basis functions in the FE method against any yfj lS available at unit cost. Thus 

hj d log(hj d ) if ^-orthogonality (|28]l holds , 
hj d S£ otherwise . 

Clearly, changing the cost model may change the definition of K( . (Some cost models 
in the literature do not include S( as part of K(.) Note that our cost model does not 
include the pre-computation cost for the CBC construction of randomly shifted lattice 
rules, which requires 0{Ni logN(S£+Nesj) operations on level I. 

Scenario 1. In the special case where A:-orthogonality d28l ) holds, the values of are 
given by d29l l. and we have 9( = for all I in the error bound d56*l l. giving the mean 
square error bound (denoted in this subsection by error 2 for simplicity) 

error 2 = (V + £ l<p(Ni)]- 1/X h?_ x j . (66) 

Scenario 2. When ^-orthogonality d28l ) does not hold, we have 9l — 1 in the er- 
ror bound d56l l. To balance the error contribution within the highest discretization 
level, we impose the condition s L 2 ^^ p ^ = ^(hj), which is equivalent to si = 

Q( 2 LTp/(2-2p)y Then ^ tQ mimmize the 

error within each level, one choice for sg is to 
set S( = sl for all £ < L, leading to = for all £ — 1, . . . ,L in d56*l l. Alternatively, 
since S( should be as small as possible from the point of view of reducing the cost at 
each level, we can impose the condition s £ ij /p l) = 0{hj_ x ) for I = 1,...,L, which 
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is equivalent to S( = Q(2 ezp ^ 1 p >) for I = 0,...,L — 1. Combining both approaches, 
while taking into account the monotonicity condition (1181) . we choose 

st := min ( [z^ 1 "^] , [ 2 Lt "/( 2 - 2 ' j )] ) for I = 0, . . . ,L . (67) 

Thus we have sg strictly increasing for I = 0, . . . , [L/2J, and the remaining half of S( 
are all identical. Our choice of S( leads again to the error bound 1661 . 

We remark that for all N E N, the Euler totient function (p(N) takes values close 
to N. Specifically, if N is prime then l/(p(N) = l/(N—l) < 2/7Y. If TYis a power of 2 
then 1 /<p(N) =2/N. It is known from ij Theorem8.8.7] that 1 /<p(N) < (e r loglogA^H 
3/loglogAO/N for all N > 3, where e r = 1.781 .. .. Thus it can be verified that for 
all computationally realistic values of Af, say, < 10 30 , we have l/(p(N) < 9/N. 
Treating this factor 9 as a constant and using hi-\ >; lit, we obtain for both scenarios 
the simpler mean square error expression 




To minimize the mean square error for a fixed cost, we consider the Lagrange 
multiplier function 

gQi) := t^7 V V +M t N * K * ■ 

1=0 

mean square error 

We look for the stationary point of g(ji) with respect to N(, thus demanding that 

^ = 4^ 1/A -^ + ^0 for i = 0,...,L. 
This prompts us to define 

N ( := 



N (hf*K h?Ki^ l+1) 



for l=l,...,L. (68) 



Leaving A^o to be specified later and treating ho and Kq as constants, we conclude that 



error 2 




and cost = G ( N Y, E t ) » ( 69 ) 



f=0 



where 

E ■= (h 2 ^K \m+i) = {( h ? X ~ dl0 ^ h ~l d )y' {X+l) if Orthogonality dH) holds, 
l ' [( e) \{hf T - d s e ) l /^+ l '> otherwise. 

We see that the mean square error is not necessarily minimized by balancing the 
error terms between the levels. For example, when ^-orthogonality d28l i holds, we 
observe that 

- For d > 2Xt, the quantity E( (and thus the mean square error and cost at level t) 
increases with increasing I. 

- For d <2Xt, the quantity E( decreases with increasing I. 
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Scenario 1 ( continued). Substituting h( x 2 e , we obtain for the case where ^-orthogonality 
holds that 

£ £ ( < = <?( £ 2- < '( 2At - c/ V(^+i)(£ + i)i/(A+D j 
e=o \e=o J 

>(l) if J<2At, 

^(£(*+2)/(A+i)) ifd = 2AT, 

^( 2 -L(2^T-d)/(l+l) L l/(l+l)-) if rf > 2At ^ 

In light of the error bound in d69l i. we choose A^o to satisfy 

*o 1/A L^ = ^(Ai T ). ( 7 °) 



f=0 



which is equivalent to = £2(/! l 2t1 (X^ =0 £» ;i ). This yields the choice 

f[2 LT < 2A )J if £ /<2rA, 
iV := I [ 2 L < 2 %^+2)/(*+i)] ifd = 2TA, (71) 

|^ ^ 2 LT(d/T+2)l/(X+l) L l/(X+l^ ifd> 2tA _ 

Then we have error 2 = 6 (hjj) ■ Upon substituting (TTOb into the cost bound in ( f69b 
and using dTTT i, we obtain 



(2 iT < 2A ^+ 2 ) ifd = 2Xr, 



cost - ff(N^ +l)IX h 2 L z ) = I 0(2 L ^L X + 2 ) ifd = 2Xx, 



Scenario 2 (continued). When ^-orthogonality does not hold, we use the definition 
(|67| | for S( and denote for simplicity 



a := 



to obtain 




L 


/LV2J 




to 













z(2X-d/z)/{X+\) 



l=\L/Z\+\ 

Md/x <2X — a, 
if d/r = 2A — a , 

< ^(2 u ( a / 2+rf /( 2T )- A V(A + 1)) if 2A - a < d/x < 2X , 
^( 2 LT(a/2)/(A+l) L ) ; if J/t = 2A, 

^( 2 U(a/2+rf/T-2A)/( A + ^ if d/x > 2X . 
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As in Scenario 1 we choose No to satisfy i.e., No = ^ 2t1 (L^o^ 
this yields 



and 



r 2 iT(2A) L ^ 

r 2 LT(a/2 +c //T+2) A/(A + 1)L Aj 
r 2 iT(a/2+£//T+2)A/(^.+l)1 

Then we have error 2 = 0(1^) as before, but now 



if d/x < 2X — a, 
if d/x = 2X — a , 
if2X-a<d/x<2X, 
i£d/x = 2X, 
if d/x > 2X. 



(72) 



cost = ^(ivf +1)/A fcf) 



' ff(2 L < 2X ">) 

0(2LT(2X) L 1 + ^ 

e .(2Lr(a/2+d/T) L l + \^ 
g>(^LT(al2+dlx)^ 



ifd/x<2X-a, 
ifd/x = 2X — a, 
\i2X-a<d/x <2X, 
ifd/x = 2X, 
\fd/x>2X. 



In both scenarios, for given £ > 0, we choose L such that 



« — Li 



(73) 



We can then express the total cost of the algorithm in terms of e. This is summarized 
in TheoremfTTIbelow. 



Theorem 11 Under Assumptions (A1)-(A7), leaving out (A5) if k-orthogonality \ 
holds, for feH- l+t (D) and G(-) e '(D) with0<t,t' < 1 andx:=t + t' > 0, 
we consider the multi-level QMC-FE algorithm defined by dl9b - Given £ > 0, with L 
given by (173b , h( given by (165b , S( given by ( 167b (or ( 129b under k-orthogonality), Nc 
given by ( 1681 ), iVo given by ( 172b (or ( 1711 ) under k-orthogonality), and with randomly 
shifted lattice rules constructed based on POD weights y u given by ( 160b or ( 162b , we 
obtain 



and 
with 



v /e[|/(g(«))-G^(-;G(«))| 2 ] = 0(e) , 
cost(et) = 0(e 



-.ML 



max ( 2Xq, — 



2X, 



2-2p 



d 
2x 



max ( X qi — 



(loge l ) b 



if k-orthogonality d28l ) holds . 



if ->^~^ 

x 1 —p 



where X q is as defined in ( 1611 ). The value ofb ML can be obtained from the cost bounds 
in Scenarios 1 and 2 in a similar way. 
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In comparison, for the single level QMC-FE algorithm in l23ll to achieve 

SL 

error, its overall cost in the case of p < 1 is 0{e~ a ), with 



see Il23l Theorem 8.1], where X p is defined analogously to X q as follows 



Note that a is much smaller than a in most cases. This is clearly seen when 
X q ps Xp. However, in the extreme case where X q and X p are furthest apart, i.e, X q — 1 
and X p ps 1 /2, it is possible to come up with an example where a SL < a ML : indeed, 
we could take d=l,X = 2, q=l and p = 1 /3, which yield a SL ps 1 .75 while a ML = 2 
under ^-orthogonality. 

Now we compare with some multi-level MC and QMC works in the literature. 
Sometimes "finite-dimensional noise" is assumed, a feature we can mimic by setting 
p = q = in our analysis, leading to a ML = max(l/(l — S),d/z). In [3,5 ,35 1, multi- 
level MC FE methods for elliptic PDEs (G3 were analyzed, however with the random 
coefficient (0 being lognormal, i.e., the exponential of a stationary, Gaussian process. 

In l26l a class of abstract multi-level QMC algorithms for infinite-dimensional 
integration was introduced, with a general cost model for the evaluation of the inte- 
grand function. The multi-level structure in that paper is different from ours: the key 
difference being that our multi-level scheme must also incorporate the multi-level 
structure of the FE discretizations. Also new is the necessity of considering 'mixed' 
regularity (in weighted reproducing kernel Hilbert spaces with respect to the param- 
eter sequence^ and in the smoothness scale Z' with respect to the spatial variable x). 

In 121 a multi-level MC FE method with finite dimensional noise was analyzed. 
It was shown there that in domains Del 2 , a FE approximation of the expectation 
of the random solution with the convergence rate in the norm ofV (rather than 

for linear functionals of the solution) can be computed in ff(M/ lL ) = fflhj 2 ) work 
and memory, i.e., with the same cost as one multi-level solution of the deterministic 
problem. 

4 Conclusion 

This paper introduces a multi-level QMC FE method, applied to functionals of the 
solution of the same PDE with random coefficient problem as considered by |6 1. The 
same problem was studied by the present authors in [23], where we developed a single 
level QMC analysis which yielded the same error bounds as in |6| within the range 
of convergence rates relevant to QMC. The aim of the present multi-level version 
of the QMC approach is to develop a method which significantly reduces the costs, 
while maintaining the fast convergence (compared to MC) associated with QMC. We 
emphasize that the multi-level version requires a new analysis, and in particular leads 




for some 8 g (0, 1 /2) when p € (0,2/3] , 
when p g (2/3,1) . 
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to a new prescription for the POD weights (different from that in [23]) that determine 
the QMC rule. Another difference is that the regularity requirements on the functions 
are also more stringent than in the single level case. 

The principal results for dimension d = 2 are as follows. In Scenario 1 where 
^-orthogonality d28l i holds, if we can choose t = t' = 1 so that T = 2, and can choose 
A = 1/(2- 25) for some 8 € (0, 1 /2), then the cost of the multi-level QMC FE al- 
gorithm for computing the expectation of G(u) is ff{2 2L l^~ s }) = ff(fi^^ 1 while 
the convergence rate is the (best possible) second order &{2~ 2L ) = ^(hj). This cor- 
responds to optimal accuracy versus work bounds for the computation of solution 
functionals in first order FE methods applied to deterministic, H 2 regular, second 
order elliptic problems (see, e.g. J4j). In contrast, multi-level MC FE methods such 
as those analyzed in |3|5 1 cannot achieve optimal complexity for output functionals 
for general, sufficiently regular covariances of the random field a(x,y), due to the 
maximal convergence rate 1/2 of standard MC methods. 

As noted earlier, our cost model does not include the pre-computation cost for 
the CBC construction of lattice rules. This is justified because the same lattice rules 
can be used for the PDE problem with different forcing terms /. However, as we are 
tailoring the choice of weights to the problem, the cost of the CBC construction may 
be a significant issue. 

The present analysis was performed under Lipschitz assumptions on and a 
in (A4) and (A7) which, together with (A6) and the assumption that G(-) G L 2 (D), 
ensure in (|6]l that Z = (Hq C\H 2 )(D) and, in turn, implies ff{h 2 ) convergence in 1161 . 
The present convergence analysis extends directly to weaker assumptions: if in (A4) 
and (A7) we have only Holder continuity C 0,r (D) for some < r < 1 instead of 
W l '°°(D) regularity, or if D is not convex, then bj in ( 1351 ). ( I60T ) and ( l62l will depend 
on llvO'llcVfo) rather than on || Vj\\ w i,»( D y 

In Theorems [7] and [8] we considered only the weighted Sobolev space norm in- 
volving mixed first derivatives with respect toy, but Theorem[6]holds for higher order 
mixed derivatives. The results here can be extended by considering higher order QMC 
methods, see e.g. ifTUl Chapter 15]. 

Finally, in our multi-level scheme we assumed that exact expectations E[-] with 
respect to random shifts A( € [0,l] s< are available. In practical realizations, these 
expectations must be approximated by MC estimates E m[ [■] based on a finite number 
mi of i.i.d. realizations of the shift A( at discretization level I = 0, 1, ...,L. This leads 
to a further error (E — £m f )[ - ] in term £ of (l23l of order &(mj l ). We can maintain 
our error-versus cost estimates in with the same choices of paremters Sf> and 
N(, by taking m( — m* independent of I, that is, a level-independent, fixed number of 
random shifts A g for each level I. 
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